A short report on different socio-economic parameters in different districts of Bihar.
Elections are round the corner in the Indian State of Bihar. One of the clients of SocialCops wants to understand the level of development in different socio-economic parameters across all the districts of Bihar. Using open data, prepare an index that measures and ranks districts of Bihar on socio-economic parameters. Also prepare a short report on the output describing each of the components of the Index.
Open data from various sources was collected, tidied, processed and compiled into the file bihar_sample.xlsx using R and Excel. After intense research and thought, 10 socio-economic parameters were selected, and most of them were converted from absolute values to percentages in that file. Using R, these parameters are now indexed and plotted to give a better understanding of the socio-economic situation in the different districts of Bihar.
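One common way to combine several parameters into a single index is min-max normalisation followed by an equal-weight average. The sketch below is an illustration only and is not necessarily the method used for this report; the helper `min_max`, the short column names and the equal weights are all assumptions. The six numbers are taken from the literacy and child-work tables later in this report.

```r
# Values taken from the tables in this report; short column names are hypothetical.
toy <- data.frame(
  district      = c("Patna", "Rohtas", "Purnia"),
  literacy_rate = c(78.3, 76.5, 58.4),  # higher is better
  child_work    = c(2.1, 1.1, 8.7),     # lower is better
  stringsAsFactors = FALSE
)

# Hypothetical helper: rescale to [0, 1], flipping the scale when lower is better.
min_max <- function(x, higher_is_better = TRUE) {
  scaled <- (x - min(x)) / (max(x) - min(x))
  if (higher_is_better) scaled else 1 - scaled
}

# Equal-weight composite (an assumption): mean of the normalised parameters.
toy$index <- rowMeans(cbind(
  min_max(toy$literacy_rate, higher_is_better = TRUE),
  min_max(toy$child_work,    higher_is_better = FALSE)
))

# Rank 1 is the best-performing district on the composite.
toy$rank <- rank(-toy$index, ties.method = "min")
print(toy[order(toy$rank), ])
```

With only two parameters the result is sensitive to the choice of weights, so any real index would need the weighting to be stated and justified.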
1. Sex Ratio
The sex ratio describes the balance between males and females in a population; in this report it is expressed as the number of females per 1000 males. The ideal sex ratio is 1:1. Due to sex-selective termination of pregnancy and female infanticide, this ratio has become imbalanced. Hence the government and society as a whole place tremendous importance on restoring a healthy sex ratio, and it is one of the key socio-economic parameters. Three subparameters are considered here: Sex Ratio at Birth, Sex Ratio (0-4 Years) and Sex Ratio (All Ages).
First we load the required libraries, xlsx and ggplot2, and read the Excel file into the environment.
library(xlsx)
library(ggplot2)
bihar <- read.xlsx("bihar_sample.xlsx", sheetIndex = 1)
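The dotted column names used in the code that follows (for example `State...District`) most likely come from R converting the spreadsheet headers into syntactically valid names when the data frame is built. The sketch below shows that conversion with `make.names()`; the header strings are guesses and may differ from the ones actually in bihar_sample.xlsx.

```r
# Hypothetical spreadsheet headers (assumed, not read from the real file).
headers <- c("State / District", "Sex Ratio at Birth", "Literacy Rate (Total)")

# make.names() replaces every character that is invalid in an R name with ".".
syntactic <- make.names(headers)
print(syntactic)
```

Running `names(bihar)` after loading the file is a quick way to confirm the exact names before using them in `aes()`.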
Now we plot the line graph, code for which is given below.
options(width = 100)
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per sex-ratio series
  geom_line(aes(y = Sex.Ratio.at.Birth, colour = "Sex.Ratio.at.Birth")) +
  geom_line(aes(y = Sex.Ratio..0..4.Years., colour = "Sex.Ratio..0..4.Years.")) +
  geom_line(aes(y = Sex.Ratio..All.Ages., colour = "Sex.Ratio..All.Ages.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 923, colour = 'Mean Sex.Ratio.at.Birth')) +
  geom_hline(aes(yintercept = 925, colour = 'Mean Sex.Ratio..0..4.Years.')) +
  geom_hline(aes(yintercept = 951, colour = 'Mean Sex.Ratio..All.Ages.')) +
  # Mark the individual district values
  geom_point(aes(y = Sex.Ratio.at.Birth, colour = "Sex.Ratio.at.Birth")) +
  geom_point(aes(y = Sex.Ratio..0..4.Years., colour = "Sex.Ratio..0..4.Years.")) +
  geom_point(aes(y = Sex.Ratio..All.Ages., colour = "Sex.Ratio..All.Ages.")) +
  ggtitle("Sex Ratio at Birth") +
  theme(plot.title = element_text(face = "bold"))
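The reference lines in the plot are hardcoded as 923, 925 and 951, which coincide with the "Bihar" row of the data, so they appear to be state-level figures rather than averages of the district values. A minimal sketch of reading such a value from the data instead of typing it in, using a few rows from the sex-ratio table in this report (the variable names are illustrative):

```r
# A few rows from the sex-ratio table in this report.
toy <- data.frame(
  State...District   = c("Bihar", "Buxar", "Patna", "Purnia"),
  Sex.Ratio.at.Birth = c(923, 997, 915, 878),
  stringsAsFactors = FALSE
)

# State-level value: the row labelled "Bihar".
state_value <- toy$Sex.Ratio.at.Birth[toy$State...District == "Bihar"]

# Unweighted mean over the district rows only (state row excluded).
districts <- toy[toy$State...District != "Bihar", ]
district_mean <- mean(districts$Sex.Ratio.at.Birth)

print(c(state = state_value, district_mean = district_mean))
```

Either number could then be passed to `geom_hline(yintercept = ...)`, with a legend label that says which of the two it is.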
To order the districts of Bihar by sex ratio we run the following code. The districts are ranked by Sex Ratio at Birth (column 2) [Higher is better]. The row labelled Bihar is the figure for the state as a whole.
sex_ratio <- bihar[ order(-bihar[,2]), ]
sex_ratio[,1:4]
## State...District Sex.Ratio.at.Birth Sex.Ratio..0..4.Years. Sex.Ratio..All.Ages.
## 8 Buxar 997 960 983
## 3 Aurangabad 985 968 1001
## 17 Kishanganj 984 988 1060
## 4 Banka 978 982 979
## 25 Pashchim Champaran 971 985 882
## 10 Gaya 970 973 1034
## 6 Bhagalpur 961 919 904
## 37 Supaul 959 971 950
## 5 Begusarai 958 953 975
## 12 Jamui 956 952 988
## 36 Siwan 945 948 961
## 19 Madhepura 938 950 916
## 34 Sheohar 937 917 939
## 13 Jehanabad 936 924 1000
## 23 Nalanda 933 945 1003
## 18 Lakhisarai 930 909 946
## 21 Munger 930 917 920
## 32 Saran 930 923 1005
## 30 Saharsa 929 920 928
## 1 Bihar 923 925 951
## 7 Bhojpur 923 917 998
## 16 Khagaria 922 938 897
## 15 Katihar 920 928 976
## 26 Patna 915 921 925
## 29 Rohtas 913 917 989
## 2 Araria 911 931 927
## 20 Madhubani 910 901 926
## 24 Nawada 909 952 1050
## 33 Sheikhpura 901 930 1014
## 11 Gopalganj 899 902 948
## 27 Purba Champaran 896 908 898
## 38 Vaishali 892 896 957
## 31 Samastipur 890 884 924
## 9 Darbhanga 886 893 923
## 22 Muzaffarpur 883 896 911
## 35 Sitamarhi 882 872 944
## 28 Purnia 878 878 938
## 14 Kaimur (Bhabua) 871 908 954
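The table above is sorted but carries no explicit rank, and the state-level "Bihar" row sits among the districts. The sketch below shows one possible way to add a rank column for districts only, using a few rows from the table; the tie-handling rule (`ties.method = "min"`) is an assumption.

```r
# A few rows from the table above.
toy <- data.frame(
  State...District   = c("Bihar", "Buxar", "Aurangabad", "Lakhisarai", "Munger", "Purnia"),
  Sex.Ratio.at.Birth = c(923, 997, 985, 930, 930, 878),
  stringsAsFactors = FALSE
)

# Drop the state-level row so that only districts are ranked.
districts <- toy[toy$State...District != "Bihar", ]

# Higher is better, so rank on the negated value; tied districts share a rank.
districts$rank <- rank(-districts$Sex.Ratio.at.Birth, ties.method = "min")

ranked <- districts[order(districts$rank), ]
print(ranked)
```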
2. Literacy Rate
The literacy rate is the percentage of the population aged seven years or above in an area at a particular time who can read and write with understanding. The denominator is the population aged seven years or more. Literacy rate is also a key factor in determining the development status of an area or district. Despite government programmes, Bihar’s literacy rate has increased only “sluggishly”. One of the main factors contributing to this relatively low literacy rate is the lack of proper school facilities, along with the inefficiency of teaching staff across the government-run education sector. There is a shortage of classrooms to accommodate all the students, and most schools lack proper sanitation. Two subparameters are also considered here: Literacy Rate (Male) and Literacy Rate (Female). The scale is in percentage.
Now we plot the line graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per literacy series
  geom_line(aes(y = Literacy.Rate..Total., colour = "Literacy.Rate..Total.")) +
  geom_line(aes(y = Literacy.Rate..Male., colour = "Literacy.Rate..Male.")) +
  geom_line(aes(y = Literacy.Rate..Female., colour = "Literacy.Rate..Female.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 67.5, colour = 'Mean Literacy.Rate..Total')) +
  geom_hline(aes(yintercept = 77.1, colour = 'Mean Literacy.Rate..Male.')) +
  geom_hline(aes(yintercept = 57.6, colour = 'Mean Literacy.Rate..Female.')) +
  ggtitle("Literacy Rate") +
  # Mark the individual district values
  geom_point(aes(y = Literacy.Rate..Total., colour = "Literacy.Rate..Total.")) +
  geom_point(aes(y = Literacy.Rate..Male., colour = "Literacy.Rate..Male.")) +
  geom_point(aes(y = Literacy.Rate..Female., colour = "Literacy.Rate..Female.")) +
  theme(plot.title = element_text(face = "bold"))
As we can see in the plot, there is a wide gender disparity in the literacy rate in Bihar. To order the districts of Bihar by literacy rate we run the following code. The districts are ranked by their total literacy rate [Higher is better].
literacy_rate <- bihar[ order(-bihar[,5]), ]
literacy_rate[,c(1,5:7)]
## State...District Literacy.Rate..Total. Literacy.Rate..Male. Literacy.Rate..Female.
## 26 Patna 78.3 86.5 69.3
## 29 Rohtas 76.5 86.8 66.4
## 21 Munger 76.1 84.6 67.2
## 14 Kaimur (Bhabua) 75.7 85.7 65.4
## 7 Bhojpur 74.3 86.7 62.0
## 8 Buxar 73.5 84.8 62.3
## 36 Siwan 73.1 81.6 64.2
## 5 Begusarai 72.6 81.2 64.2
## 3 Aurangabad 72.2 83.1 61.7
## 38 Vaishali 71.4 80.2 62.3
## 6 Bhagalpur 71.0 79.1 62.1
## 32 Saran 70.9 81.2 60.9
## 10 Gaya 69.8 80.3 60.0
## 13 Jehanabad 69.7 81.6 57.9
## 11 Gopalganj 68.6 78.7 58.4
## 33 Sheikhpura 68.3 79.6 57.5
## 18 Lakhisarai 68.1 78.4 57.7
## 22 Muzaffarpur 67.7 74.9 59.8
## 1 Bihar 67.5 77.1 57.6
## 23 Nalanda 67.5 78.6 56.8
## 16 Khagaria 67.0 75.0 58.0
## 4 Banka 65.4 76.7 54.5
## 31 Samastipur 65.2 74.7 54.8
## 30 Saharsa 64.4 75.7 52.3
## 37 Supaul 64.0 75.2 52.5
## 19 Madhepura 63.2 74.3 51.3
## 24 Nawada 63.2 74.6 53.1
## 12 Jamui 62.4 74.8 50.3
## 2 Araria 61.9 71.6 51.7
## 17 Kishanganj 61.5 70.5 53.4
## 25 Pashchim Champaran 61.4 71.4 49.8
## 27 Purba Champaran 61.4 71.0 50.9
## 20 Madhubani 61.2 71.4 50.6
## 35 Sitamarhi 61.2 69.9 52.1
## 9 Darbhanga 61.0 70.2 51.4
## 15 Katihar 60.0 67.9 52.3
## 34 Sheohar 59.8 69.5 49.6
## 28 Purnia 58.4 65.2 51.2
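The gender disparity mentioned in this section can be quantified as the difference between male and female literacy rates. The sketch below does this for a few rows from the table above; the column name `gender_gap` is introduced here for illustration.

```r
# A few rows from the literacy table above.
toy <- data.frame(
  State...District       = c("Patna", "Bhojpur", "Bihar", "Purnia"),
  Literacy.Rate..Male.   = c(86.5, 86.7, 77.1, 65.2),
  Literacy.Rate..Female. = c(69.3, 62.0, 57.6, 51.2),
  stringsAsFactors = FALSE
)

# Gap in percentage points: male rate minus female rate.
toy$gender_gap <- toy$Literacy.Rate..Male. - toy$Literacy.Rate..Female.

# Largest gap first.
print(toy[order(-toy$gender_gap), ])
```

For the state as a whole the gap is 19.5 percentage points (77.1 against 57.6).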
3. Children currently attending school
The percentage of children currently attending school should be as high as possible, because schooling promotes skill development and knowledge among children. As a social parameter, it reflects the presence of proper schooling infrastructure, the willingness of parents to send their wards to school and the general level of development in the district.
Two subparameters have also been studied here: Children currently attending school (Age 6-17 years)(Male) and Children currently attending school (Age 6-17 years)(Female). The children in this dataset are aged 6-17 years and the scale is in percentage.
Now to analyse this social parameter we will plot the line graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per attendance series
  geom_line(aes(y = Children.currently.attending.school..Age.6.17.years......Total., colour = "Children.currently.attending.school..Age.6.17.years......Total.")) +
  geom_line(aes(y = Children.currently.attending.school..Age.6.17.years......Male., colour = "Children.currently.attending.school..Age.6.17.years......Male.")) +
  geom_line(aes(y = FemalChildren.currently.attending.school..Age.6.17.years......Female., colour = "FemalChildren.currently.attending.school..Age.6.17.years......Female.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 92.2, colour = 'Mean Children.currently.attending.school..Age.6.17.years......Total.')) +
  geom_hline(aes(yintercept = 92.7, colour = 'Mean Children.currently.attending.school..Age.6.17.years......Male.')) +
  geom_hline(aes(yintercept = 91.6, colour = 'Mean FemalChildren.currently.attending.school..Age.6.17.years......Female.')) +
  ggtitle("Children currently attending school (Age 6-17 years)") +
  # Mark the individual district values
  geom_point(aes(y = Children.currently.attending.school..Age.6.17.years......Total., colour = "Children.currently.attending.school..Age.6.17.years......Total.")) +
  geom_point(aes(y = Children.currently.attending.school..Age.6.17.years......Male., colour = "Children.currently.attending.school..Age.6.17.years......Male.")) +
  geom_point(aes(y = FemalChildren.currently.attending.school..Age.6.17.years......Female., colour = "FemalChildren.currently.attending.school..Age.6.17.years......Female.")) +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical")
Now to order the districts of Bihar by the percentage of children currently attending school we run the following code. The districts are ranked by their respective total percentage [Higher is better].
children_school <- bihar[ order(-bihar[,8]), ]
children_school[,c(1,8:10)]
## State...District Children.currently.attending.school..Age.6.17.years......Total.
## 31 Samastipur 96.6
## 36 Siwan 96.5
## 11 Gopalganj 96.0
## 14 Kaimur (Bhabua) 94.7
## 22 Muzaffarpur 94.6
## 38 Vaishali 94.3
## 32 Saran 94.2
## 29 Rohtas 94.1
## 3 Aurangabad 93.9
## 13 Jehanabad 93.5
## 16 Khagaria 93.2
## 25 Pashchim Champaran 92.9
## 20 Madhubani 92.4
## 8 Buxar 92.3
## 9 Darbhanga 92.3
## 37 Supaul 92.3
## 1 Bihar 92.2
## 5 Begusarai 92.2
## 7 Bhojpur 92.0
## 19 Madhepura 91.9
## 21 Munger 91.9
## 26 Patna 91.6
## 28 Purnia 91.6
## 18 Lakhisarai 91.1
## 6 Bhagalpur 90.8
## 35 Sitamarhi 90.8
## 24 Nawada 90.7
## 4 Banka 90.5
## 2 Araria 90.4
## 23 Nalanda 90.3
## 27 Purba Champaran 90.3
## 34 Sheohar 90.2
## 33 Sheikhpura 90.0
## 15 Katihar 89.9
## 10 Gaya 88.7
## 30 Saharsa 88.7
## 12 Jamui 88.6
## 17 Kishanganj 88.3
## Children.currently.attending.school..Age.6.17.years......Male.
## 31 96.4
## 36 96.4
## 11 96.2
## 14 95.5
## 22 94.5
## 38 93.8
## 32 94.5
## 29 94.0
## 3 93.8
## 13 94.3
## 16 93.2
## 25 93.5
## 20 93.1
## 8 93.2
## 9 93.1
## 37 93.6
## 1 92.7
## 5 92.0
## 7 93.9
## 19 92.9
## 21 92.1
## 26 91.8
## 28 91.9
## 18 92.2
## 6 91.0
## 35 90.6
## 24 92.7
## 4 91.7
## 2 91.2
## 23 91.8
## 27 90.2
## 34 90.8
## 33 91.3
## 15 90.2
## 10 91.5
## 30 90.2
## 12 91.3
## 17 88.9
## FemalChildren.currently.attending.school..Age.6.17.years......Female.
## 31 96.8
## 36 96.6
## 11 95.8
## 14 93.9
## 22 94.8
## 38 94.9
## 32 93.8
## 29 94.2
## 3 94.0
## 13 92.7
## 16 93.1
## 25 92.2
## 20 91.6
## 8 91.3
## 9 91.4
## 37 90.8
## 1 91.6
## 5 92.4
## 7 90.0
## 19 90.8
## 21 91.7
## 26 91.4
## 28 91.3
## 18 89.8
## 6 90.5
## 35 91.1
## 24 88.7
## 4 89.3
## 2 89.6
## 23 88.8
## 27 90.4
## 34 89.6
## 33 88.7
## 15 89.6
## 10 85.8
## 30 87.1
## 12 85.7
## 17 87.6
4. Children engaged in work
The percentage of children engaged in work, also referred to as child labour, refers to the employment of children in any work that deprives them of their childhood, interferes with their ability to attend regular school, and is mentally, physically, socially or morally dangerous and harmful. This practice is considered exploitative by many international organisations. Poverty is the greatest single cause behind child labour. This socio-economic parameter gives a holistic view of poverty and the scope for development in the district or state. The subparameters Children aged 5-14 years engaged in work (%)(Total)(Male) and Children aged 5-14 years engaged in work (%)(Total)(Female) have also been considered. The children in this dataset are aged 5-14 years and the scale is in percentage.
Now to analyse this social parameter we will plot the line graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per child-work series
  geom_line(aes(y = Children.aged.5.14.years.engaged.in.work.....Total., colour = "Children.aged.5.14.years.engaged.in.work.....Total.")) +
  geom_line(aes(y = Children.aged.5.14.years.engaged.in.work.....Total..Male., colour = "Children.aged.5.14.years.engaged.in.work.....Total..Male.")) +
  geom_line(aes(y = Children.aged.5.14.years.engaged.in.work.....Total..Female., colour = "Children.aged.5.14.years.engaged.in.work.....Total..Female.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 3, colour = 'Mean Children.aged.5.14.years.engaged.in.work.....Total.')) +
  geom_hline(aes(yintercept = 3.5, colour = 'Mean Children.aged.5.14.years.engaged.in.work.....Total..Male.')) +
  geom_hline(aes(yintercept = 2.6, colour = 'Mean Children.aged.5.14.years.engaged.in.work.....Total..Female.')) +
  ggtitle("Children aged 5-14 years engaged in work (Total)") +
  # Mark the individual district values
  geom_point(aes(y = Children.aged.5.14.years.engaged.in.work.....Total., colour = "Children.aged.5.14.years.engaged.in.work.....Total.")) +
  geom_point(aes(y = Children.aged.5.14.years.engaged.in.work.....Total..Male., colour = "Children.aged.5.14.years.engaged.in.work.....Total..Male.")) +
  geom_point(aes(y = Children.aged.5.14.years.engaged.in.work.....Total..Female., colour = "Children.aged.5.14.years.engaged.in.work.....Total..Female.")) +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical")
Now to order the districts of Bihar by the percentage of children engaged in work we run the following code. The districts are ranked by their respective total percentage [Lower is better].
children_work <- bihar[ order(bihar[,11]), ]
children_work[,c(1,11:13)]
## State...District Children.aged.5.14.years.engaged.in.work.....Total.
## 29 Rohtas 1.1
## 14 Kaimur (Bhabua) 1.2
## 3 Aurangabad 1.4
## 8 Buxar 1.4
## 7 Bhojpur 1.6
## 13 Jehanabad 1.6
## 18 Lakhisarai 1.8
## 5 Begusarai 1.9
## 21 Munger 1.9
## 36 Siwan 1.9
## 12 Jamui 2.0
## 38 Vaishali 2.0
## 26 Patna 2.1
## 27 Purba Champaran 2.2
## 6 Bhagalpur 2.3
## 4 Banka 2.4
## 11 Gopalganj 2.4
## 16 Khagaria 2.4
## 25 Pashchim Champaran 2.4
## 10 Gaya 2.5
## 31 Samastipur 2.7
## 24 Nawada 2.8
## 30 Saharsa 2.9
## 1 Bihar 3.0
## 19 Madhepura 3.0
## 22 Muzaffarpur 3.0
## 35 Sitamarhi 3.1
## 20 Madhubani 3.3
## 32 Saran 3.4
## 33 Sheikhpura 3.7
## 37 Supaul 3.8
## 15 Katihar 4.0
## 23 Nalanda 4.3
## 34 Sheohar 4.6
## 2 Araria 4.9
## 9 Darbhanga 6.7
## 28 Purnia 8.7
## 17 Kishanganj 9.1
## Children.aged.5.14.years.engaged.in.work.....Total..Male.
## 29 1.5
## 14 1.7
## 3 1.8
## 8 2.1
## 7 2.1
## 13 2.1
## 18 2.3
## 5 2.6
## 21 2.5
## 36 2.3
## 12 2.3
## 38 2.4
## 26 3.0
## 27 2.5
## 6 2.8
## 4 3.5
## 11 2.9
## 16 3.1
## 25 3.1
## 10 3.4
## 31 3.1
## 24 3.1
## 30 3.2
## 1 3.5
## 19 2.8
## 22 3.4
## 35 3.4
## 20 4.0
## 32 3.5
## 33 3.8
## 37 3.7
## 15 4.9
## 23 4.2
## 34 4.6
## 2 5.1
## 9 6.2
## 28 9.5
## 17 9.5
## Children.aged.5.14.years.engaged.in.work.....Total..Female.
## 29 0.7
## 14 0.6
## 3 0.8
## 8 0.7
## 7 1.0
## 13 1.1
## 18 1.3
## 5 1.1
## 21 1.3
## 36 1.4
## 12 1.7
## 38 1.5
## 26 1.1
## 27 1.8
## 6 1.8
## 4 1.2
## 11 1.9
## 16 1.7
## 25 1.5
## 10 1.6
## 31 2.3
## 24 2.6
## 30 2.5
## 1 2.6
## 19 3.2
## 22 2.7
## 35 2.7
## 20 2.6
## 32 3.2
## 33 3.6
## 37 4.0
## 15 3.1
## 23 4.3
## 34 4.7
## 2 4.7
## 9 7.3
## 28 7.9
## 17 8.7
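Some parameters in this report are ranked in descending order (higher is better) and others in ascending order (lower is better). A small helper can make that direction explicit instead of relying on a minus sign inside `order()`. The function `rank_by` and the short column names below are hypothetical; the values are taken from the tables in this report.

```r
# Hypothetical helper: sort a data frame by one column, with the direction
# stated through a named argument.
rank_by <- function(df, column, higher_is_better = TRUE) {
  values <- df[[column]]
  df[order(if (higher_is_better) -values else values), , drop = FALSE]
}

# Values from the child-work and immunization tables; short names are illustrative.
toy <- data.frame(
  State...District = c("Kishanganj", "Rohtas", "Patna"),
  child_work_pct   = c(9.1, 1.1, 2.1),
  immunized_pct    = c(26.6, 66.9, 71.5),
  stringsAsFactors = FALSE
)

best_child_work <- rank_by(toy, "child_work_pct", higher_is_better = FALSE)
best_immunized  <- rank_by(toy, "immunized_pct",  higher_is_better = TRUE)

print(best_child_work$State...District)
print(best_immunized$State...District)
```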
5. Prevalence of any type of disability
Disability is the consequence of an impairment that may be physical, cognitive, mental, sensory, emotional, developmental, or some combination of these. A disability may be present from birth, or occur during a person’s lifetime. Individuals may also qualify as disabled if they have had an impairment in the past or are seen as disabled based on a personal or group standard or norm. Such impairments may include physical, sensory, and cognitive or developmental disabilities. Mental disorders (also known as psychiatric or psychosocial disability) and various types of chronic disease may also qualify as disabilities. Prevalence of disability is an important socio-economic parameter that reflects the status of health infrastructure, the level of nutrition and the number of able-bodied men and women.
The subparameters Prevalence of any type of Disability (Total)(Male) and Prevalence of any type of Disability (Total)(Female) have also been examined. The scale here is number per 100,000 population.
Now to analyse this social parameter we will plot the line graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per disability-prevalence series
  geom_line(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total.")) +
  geom_line(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male.")) +
  geom_line(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 1617, colour = 'Mean Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total.')) +
  geom_hline(aes(yintercept = 1958, colour = 'Mean Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male.')) +
  geom_hline(aes(yintercept = 1262, colour = 'Mean Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female.')) +
  ggtitle("Prevalence of any type of Disability (Per 100,000 Population)(Total)") +
  # Mark the individual district values
  geom_point(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total.")) +
  geom_point(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male.")) +
  geom_point(aes(y = Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female., colour = "Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female.")) +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical")
Now to order the districts of Bihar by prevalence of any type of disability we run the following code. The districts are ranked by their total prevalence of any type of disability. [Lower is better].
disablity <- bihar[ order(bihar[,14]), ]
disablity[,c(1,14:16)]
## State...District Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total.
## 36 Siwan 1167
## 19 Madhepura 1234
## 17 Kishanganj 1236
## 26 Patna 1255
## 12 Jamui 1417
## 8 Buxar 1428
## 7 Bhojpur 1435
## 38 Vaishali 1437
## 4 Banka 1443
## 27 Purba Champaran 1472
## 37 Supaul 1475
## 14 Kaimur (Bhabua) 1492
## 5 Begusarai 1493
## 21 Munger 1530
## 29 Rohtas 1556
## 18 Lakhisarai 1589
## 22 Muzaffarpur 1591
## 28 Purnia 1601
## 32 Saran 1603
## 1 Bihar 1617
## 35 Sitamarhi 1622
## 11 Gopalganj 1630
## 30 Saharsa 1648
## 2 Araria 1660
## 31 Samastipur 1687
## 15 Katihar 1708
## 6 Bhagalpur 1728
## 34 Sheohar 1752
## 33 Sheikhpura 1835
## 25 Pashchim Champaran 1836
## 13 Jehanabad 1838
## 9 Darbhanga 1880
## 20 Madhubani 1914
## 16 Khagaria 1942
## 10 Gaya 1985
## 24 Nawada 1996
## 23 Nalanda 2177
## 3 Aurangabad 2223
## Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Male.
## 36 1406
## 19 1442
## 17 1386
## 26 1513
## 12 1737
## 8 1688
## 7 1791
## 38 1726
## 4 1859
## 27 1762
## 37 1768
## 14 1850
## 5 1788
## 21 1929
## 29 1895
## 18 2001
## 22 1864
## 28 1945
## 32 2066
## 1 1958
## 35 1879
## 11 1920
## 30 1983
## 2 1952
## 31 2089
## 15 2106
## 6 2087
## 34 2196
## 33 2204
## 25 2220
## 13 2281
## 9 2358
## 20 2271
## 16 2341
## 10 2422
## 24 2483
## 23 2679
## 3 2647
## Prevalence.of.any.type.of.Disability..Per.100.000.Population..Total..Female.
## 36 919
## 19 1013
## 17 1099
## 26 970
## 12 1101
## 8 1168
## 7 1077
## 38 1137
## 4 1035
## 27 1156
## 37 1171
## 14 1122
## 5 1201
## 21 1107
## 29 1217
## 18 1166
## 22 1289
## 28 1236
## 32 1143
## 1 1262
## 35 1348
## 11 1331
## 30 1289
## 2 1353
## 31 1247
## 15 1315
## 6 1335
## 34 1283
## 33 1475
## 25 1404
## 13 1392
## 9 1377
## 20 1537
## 16 1501
## 10 1567
## 24 1549
## 23 1684
## 3 1809
6. Children Immunized
Immunization is the process whereby a person is made immune or resistant to an infectious disease, typically by the administration of a vaccine. Vaccines stimulate the body’s own immune system to protect the person against subsequent infection or disease. This socio-economic parameter refers to the percentage of children who have been administered a vaccine. Immunization is important because it prevents illness and makes us resistant to several diseases. Proper immunization leads to a healthy workforce and human capital in the future, and it reflects the present healthcare infrastructure and its reach. The children in this dataset are aged 12-23 months and the scale is in percentage.
Now to analyse this social parameter we will plot the bar graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1, y = Children.Immunized..Age.12.23.months...., label = Children.Immunized..Age.12.23.months....)) +
  # One bar per district
  geom_bar(stat = "identity", aes(y = Children.Immunized..Age.12.23.months...., colour = "Children.Immunized..Age.12.23.months....")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference line
  geom_hline(aes(yintercept = 65.6, colour = 'Mean Children Immunized')) +
  ggtitle("Children Immunized (Age 12-23 months)(%)") +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical") +
  # Print each value inside its bar
  geom_text(size = 3, colour = 'white', vjust = 2)
Now to order the districts of Bihar by the percentage of children immunized we run the following code. The districts are ranked by their respective percentage. [Higher is better].
immunization <- bihar[ order(-bihar[,17]), ]
immunization[,c(1,17)]
## State...District Children.Immunized..Age.12.23.months....
## 20 Madhubani 82.8
## 28 Purnia 79.9
## 16 Khagaria 79.4
## 32 Saran 76.4
## 21 Munger 75.9
## 31 Samastipur 75.8
## 13 Jehanabad 74.9
## 33 Sheikhpura 74.9
## 3 Aurangabad 74.2
## 36 Siwan 73.3
## 19 Madhepura 73.2
## 4 Banka 71.7
## 26 Patna 71.5
## 14 Kaimur (Bhabua) 71.0
## 37 Supaul 70.5
## 38 Vaishali 70.3
## 11 Gopalganj 69.8
## 34 Sheohar 68.5
## 30 Saharsa 68.1
## 23 Nalanda 67.1
## 7 Bhojpur 67.0
## 29 Rohtas 66.9
## 6 Bhagalpur 65.7
## 1 Bihar 65.6
## 8 Buxar 64.7
## 35 Sitamarhi 64.4
## 24 Nawada 63.1
## 18 Lakhisarai 62.9
## 15 Katihar 62.3
## 5 Begusarai 62.0
## 10 Gaya 61.7
## 9 Darbhanga 60.4
## 22 Muzaffarpur 58.0
## 2 Araria 54.5
## 25 Pashchim Champaran 52.2
## 27 Purba Champaran 42.0
## 12 Jamui 38.8
## 17 Kishanganj 26.6
7. Crude Death Rate
The crude death rate is the number of deaths occurring during the year per 1,000 population estimated at midyear. It is also known as the mortality rate. The crude death rate depends on the age (and gender) specific mortality rates and the age (and gender) distribution of the population. The number of deaths per 1,000 people can be higher in developed nations than in less-developed countries, despite a higher life expectancy in developed countries due to better standards of health. This happens because developed countries typically have a much higher proportion of older people, due to both lower birth rates and lower mortality rates. This socio-economic parameter is also important because it reflects the state of healthcare infrastructure and influences the growth rate of the population in a region or district. Here the scale is number per 1000 population.
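The dependence of the crude death rate on age structure can be illustrated with a small calculation. In the sketch below two regions share identical age-specific death rates but differ in their age distribution; every number is invented for illustration, and `crude_death_rate` is a hypothetical helper.

```r
# Crude death rate: deaths in the year per 1,000 mid-year population.
crude_death_rate <- function(deaths, midyear_population) {
  1000 * deaths / midyear_population
}

# Hypothetical age-specific death rates (per 1,000), identical in both regions.
age_specific_rate <- c(young = 2, old = 40)

# Hypothetical mid-year populations by age group.
region_young <- c(young = 90000, old = 10000)  # mostly young population
region_old   <- c(young = 60000, old = 40000)  # larger share of older people

# Expected deaths follow from the age-specific rates and the group sizes.
deaths_young <- sum(age_specific_rate * region_young / 1000)
deaths_old   <- sum(age_specific_rate * region_old / 1000)

cdr_young <- crude_death_rate(deaths_young, sum(region_young))
cdr_old   <- crude_death_rate(deaths_old,   sum(region_old))

print(c(cdr_young = cdr_young, cdr_old = cdr_old))
```

The older region shows the higher crude rate even though mortality at each age is the same, which is why crude death rates should be compared across districts with some caution.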
Now to analyse this social parameter we will plot the bar graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1, y = Crude.Death.Rate..Per.1000.population., label = Crude.Death.Rate..Per.1000.population.)) +
  # One bar per district
  geom_bar(stat = "identity", aes(y = Crude.Death.Rate..Per.1000.population., colour = "Crude.Death.Rate..Per.1000.population.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference line
  geom_hline(aes(yintercept = 7, colour = 'Mean Crude Death Rate (Per 1000 population)')) +
  ggtitle("Crude Death Rate (Per 1000 population)") +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical") +
  # Print each value inside its bar
  geom_text(size = 3, colour = 'white', vjust = 2)
Now to order the districts of Bihar by crude death rate we run the following code. The districts are ranked by their crude death rate. [Lower is better].
crude <- bihar[ order(bihar[,18]), ]
crude[,c(1,18)]
## State...District Crude.Death.Rate..Per.1000.population.
## 26 Patna 5.0
## 6 Bhagalpur 5.2
## 7 Bhojpur 5.5
## 14 Kaimur (Bhabua) 5.5
## 4 Banka 5.7
## 24 Nawada 5.7
## 13 Jehanabad 5.8
## 3 Aurangabad 6.0
## 5 Begusarai 6.2
## 12 Jamui 6.2
## 18 Lakhisarai 6.2
## 11 Gopalganj 6.3
## 8 Buxar 6.4
## 15 Katihar 6.4
## 17 Kishanganj 6.4
## 21 Munger 6.4
## 37 Supaul 6.4
## 29 Rohtas 6.6
## 31 Samastipur 6.7
## 1 Bihar 7.0
## 10 Gaya 7.0
## 28 Purnia 7.1
## 19 Madhepura 7.2
## 20 Madhubani 7.2
## 36 Siwan 7.3
## 38 Vaishali 7.4
## 34 Sheohar 7.5
## 2 Araria 7.6
## 30 Saharsa 7.6
## 32 Saran 7.6
## 23 Nalanda 7.7
## 27 Purba Champaran 7.8
## 33 Sheikhpura 7.8
## 9 Darbhanga 8.6
## 22 Muzaffarpur 8.6
## 25 Pashchim Champaran 8.7
## 16 Khagaria 9.3
## 35 Sitamarhi 9.3
8. Infant Mortality Rate
Infant mortality is the death of a child less than one year of age. It is measured as the infant mortality rate (IMR), which is the number of deaths of children under one year of age per 1000 live births. The leading causes of infant mortality are birth asphyxia, pneumonia, pre-term birth complications, diarrhoea, malaria, measles and malnutrition. Many factors contribute to infant mortality, such as the mother’s level of education, environmental conditions, and political and medical infrastructure. Improving sanitation, access to clean drinking water, immunization against infectious diseases, and other public health measures could help reduce high rates of infant mortality. Consistent with this definition, the scale here is number per 1000 live births, although the dataset column is labelled "Per 1000 Population".
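As a worked example of the IMR definition, the sketch below computes the rate from counts of infant deaths and live births. The counts are invented and `infant_mortality_rate` is a hypothetical helper; they are chosen so that the result equals 52, the state-level figure in this dataset.

```r
# Infant mortality rate: deaths under one year of age per 1,000 live births.
infant_mortality_rate <- function(infant_deaths, live_births) {
  1000 * infant_deaths / live_births
}

# Hypothetical counts for one year in one region.
imr <- infant_mortality_rate(infant_deaths = 1300, live_births = 25000)
print(imr)
```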
Now to analyse this social parameter we will plot the bar graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1, y = Infant.Mortality.Rate..Per.1000.Population., label = Infant.Mortality.Rate..Per.1000.Population.)) +
  # One bar per district
  geom_bar(stat = "identity", aes(y = Infant.Mortality.Rate..Per.1000.Population., colour = "Infant.Mortality.Rate..Per.1000.Population.")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference line
  geom_hline(aes(yintercept = 52, colour = 'Mean Infant.Mortality.Rate..Per.1000.Population.')) +
  ggtitle("Infant Mortality Rate (Per 1000 Population)") +
  theme(plot.title = element_text(face = "bold")) +
  theme(legend.position = "bottom", legend.direction = "vertical") +
  # Print each value inside its bar
  geom_text(size = 3, colour = 'white', vjust = 2)
Now to order the districts of Bihar by infant mortality rate we run the following code. The districts are ranked by their infant mortality rate. [Lower is better].
infant <- bihar[ order(bihar[,19]), ]
infant[,c(1,19)]
## State...District Infant.Mortality.Rate..Per.1000.Population.
## 26 Patna 37
## 5 Begusarai 43
## 3 Aurangabad 44
## 7 Bhojpur 44
## 4 Banka 45
## 36 Siwan 46
## 24 Nawada 47
## 34 Sheohar 47
## 38 Vaishali 47
## 9 Darbhanga 48
## 11 Gopalganj 48
## 21 Munger 48
## 23 Nalanda 49
## 29 Rohtas 49
## 18 Lakhisarai 50
## 6 Bhagalpur 51
## 13 Jehanabad 51
## 32 Saran 51
## 1 Bihar 52
## 10 Gaya 52
## 20 Madhubani 52
## 31 Samastipur 52
## 8 Buxar 53
## 14 Kaimur (Bhabua) 53
## 25 Pashchim Champaran 53
## 27 Purba Champaran 53
## 12 Jamui 54
## 15 Katihar 55
## 2 Araria 56
## 33 Sheikhpura 56
## 22 Muzaffarpur 57
## 17 Kishanganj 58
## 28 Purnia 58
## 30 Saharsa 59
## 37 Supaul 61
## 16 Khagaria 63
## 35 Sitamarhi 64
## 19 Madhepura 68
9. Household Members Seeking Work
Household members seeking work can be read as a measure of unemployment in the district or state. The unemployment rate measures the prevalence of unemployment and is calculated as a percentage by dividing the number of unemployed individuals by all individuals currently in the labor force. Causes of unemployment vary and can include unionization, bureaucratic work rules, minimum wage laws and taxes. It is a very important socio-economic parameter, as it indicates the shape of the local economy, the ease of doing work, the structure of taxes and general economic growth. This parameter is of paramount importance for assessing the economic condition of a region or district.
Three more subparameters were also studied: Household members seeking work(%) (2), Household members seeking work(%) (3) and Household members seeking work(%) (4+). The scale here is in percentage.
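The unemployment-rate calculation described in this section can be written out as a short worked example. The counts below are invented and `unemployment_rate` is a hypothetical helper; the labor force is taken to be the employed plus the unemployed.

```r
# Unemployment rate: unemployed as a percentage of the labor force,
# where labor force = employed + unemployed.
unemployment_rate <- function(unemployed, employed) {
  100 * unemployed / (employed + unemployed)
}

# Hypothetical counts for one district.
rate <- unemployment_rate(unemployed = 2000, employed = 8000)
print(rate)
```

Note that the columns in this dataset are household-level percentages, so they are related to, but not the same as, an unemployment rate computed over individuals.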
Now to analyse this social parameter we will plot the line graph, code for which is given below.
ggplot(bihar, aes(State...District, group = 1)) +
  # One line per household series
  geom_line(aes(y = Household.members.seeking.work.....1., colour = "Household.members.seeking.work.....1.")) +
  geom_line(aes(y = Household.members.seeking.work.....2., colour = "Household.members.seeking.work.....2.")) +
  geom_line(aes(y = Household.members.seeking.work.....3., colour = "Household.members.seeking.work.....3.")) +
  geom_line(aes(y = Household.members.seeking.work.....4.., colour = "Household.members.seeking.work.....4..")) +
  # Rotate the district labels so they do not overlap
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  # Horizontal reference lines
  geom_hline(aes(yintercept = 19.26, colour = 'Mean Household.members.seeking.work.....1.')) +
  geom_hline(aes(yintercept = 9.68, colour = 'Mean Household.members.seeking.work.....2.')) +
  geom_hline(aes(yintercept = 3.18, colour = 'Mean Household.members.seeking.work.....3.')) +
  geom_hline(aes(yintercept = 2.58, colour = 'Mean Household.members.seeking.work.....4..')) +
  ggtitle("Number of Household Members Seeking Work(%)") +
  # Mark the individual district values
  geom_point(aes(y = Household.members.seeking.work.....1., colour = "Household.members.seeking.work.....1.")) +
  geom_point(aes(y = Household.members.seeking.work.....2., colour = "Household.members.seeking.work.....2.")) +
  geom_point(aes(y = Household.members.seeking.work.....3., colour = "Household.members.seeking.work.....3.")) +
  geom_point(aes(y = Household.members.seeking.work.....4.., colour = "Household.members.seeking.work.....4..")) +
  theme(legend.position = "bottom", legend.direction = "vertical")
Now to order the districts of Bihar by the percentage of household members seeking work we run the following code. The districts are ranked by their respective percentage in the first column shown, Household members seeking work(%) (1). [Lower is better].
household <- bihar[ order(bihar[,21]), ]
household[,c(1,21:24)]
## State...District Household.members.seeking.work.....1. Household.members.seeking.work.....2.
## 17 Kishanganj 14.19 5.20
## 23 Nalanda 15.55 8.69
## 24 Nawada 15.58 9.04
## 33 Sheikhpura 15.60 9.12
## 13 Jehanabad 15.93 8.96
## 12 Jamui 16.21 9.00
## 10 Gaya 16.57 10.32
## 2 Araria 16.88 7.96
## 26 Patna 17.23 8.92
## 27 Purba Champaran 17.67 7.64
## 15 Katihar 17.84 8.23
## 28 Purnia 17.95 8.37
## 19 Madhepura 18.17 10.90
## 35 Sitamarhi 18.17 6.95
## 20 Madhubani 18.47 8.55
## 8 Buxar 18.67 10.01
## 7 Bhojpur 18.78 10.17
## 14 Kaimur (Bhabua) 19.04 10.61
## 34 Sheohar 19.11 7.29
## 3 Aurangabad 19.22 10.96
## 1 Bihar 19.26 9.68
## 18 Lakhisarai 19.31 11.15
## 31 Samastipur 19.71 9.01
## 38 Vaishali 19.71 8.74
## 37 Supaul 19.72 11.00
## 25 Pashchim Champaran 19.92 11.16
## 29 Rohtas 20.10 11.10
## 4 Banka 20.71 12.50
## 11 Gopalganj 20.94 9.87
## 30 Saharsa 21.13 11.68
## 32 Saran 21.31 10.22
## 5 Begusarai 21.47 10.35
## 9 Darbhanga 21.63 9.72
## 36 Siwan 22.55 10.90
## 16 Khagaria 22.77 12.09
## 22 Muzaffarpur 23.13 11.30
## 6 Bhagalpur 23.26 13.66
## 21 Munger 24.08 12.55
## Household.members.seeking.work.....3. Household.members.seeking.work.....4..
## 17 1.70 1.06
## 23 3.20 3.14
## 24 3.59 3.90
## 33 3.46 3.59
## 13 3.39 3.38
## 12 2.98 2.82
## 10 3.96 4.53
## 2 2.16 1.42
## 26 3.69 3.18
## 27 2.29 1.55
## 15 2.38 1.52
## 28 2.54 1.75
## 19 2.55 2.04
## 35 1.79 1.03
## 20 2.46 1.70
## 8 4.00 3.60
## 7 4.33 4.22
## 14 3.68 3.59
## 34 1.55 0.83
## 3 4.26 4.46
## 1 3.18 2.58
## 18 4.21 3.85
## 31 2.56 1.70
## 38 3.08 2.16
## 37 2.78 2.15
## 25 3.59 2.98
## 29 4.39 4.04
## 4 3.96 3.49
## 11 3.98 3.32
## 30 2.96 2.24
## 32 4.17 3.56
## 5 3.03 2.11
## 9 2.59 1.57
## 36 4.39 3.58
## 16 3.10 2.18
## 22 3.51 2.40
## 6 4.31 3.54
## 21 4.17 2.73
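The table above selects columns by position and keeps the state-level "Bihar" row among the districts. As a minimal sketch of an alternative, the snippet below selects the column by name, drops the state row and adds an explicit rank. The data frame `toy` holds only four rows copied from the output above, and the names `toy`, `col`, `ranked` and `Rank` are illustrative assumptions, not part of the original analysis.

```r
# Toy subset: four rows copied from the output above (the real data frame is `bihar`).
toy <- data.frame(
  State...District = c("Bihar", "Kishanganj", "Munger", "Nalanda"),
  Household.members.seeking.work.....1. = c(19.26, 14.19, 24.08, 15.55),
  stringsAsFactors = FALSE
)

# Refer to the column by name instead of by position (21).
col <- "Household.members.seeking.work.....1."

# Drop the state-level aggregate so that only districts are ranked.
districts <- toy[toy$State...District != "Bihar", ]

# Sort ascending (lower is better) and attach an explicit rank.
ranked <- districts[order(districts[[col]]), ]
ranked$Rank <- rank(ranked[[col]], ties.method = "min")

print(ranked[, c("State...District", col, "Rank")], row.names = FALSE)
```

The same steps should work on the full `bihar` data frame, provided the state row is labelled "Bihar" in the `State...District` column as it is in the output above.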
10.Student Teacher Ratio
Student-teacher ratio, or student-faculty ratio, is the number of students who attend a school or university divided by the number of teachers in the institution. For example, a student-teacher ratio of 10:1 indicates that there are 10 students for every teacher. Factors that can affect the relationship between student-teacher ratio and class size include the number of teachers with non-teaching duties, the number of classes per teacher, and the number of teachers per class. Classes with too many students are often disruptive to education. A large class also brings together students with widely varying learning abilities, so class time is spent helping slower learners assimilate the material when it could be better spent progressing through the curriculum. Hence, to promote learning, a lower student teacher ratio should be maintained. This socio-economic parameter tells us about the state of education, schooling infrastructure, availability of teachers, etc. The scale here is the number of students per teacher.
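As a small worked example of the definition above, the ratio is simply the student count divided by the teacher count. The school names and counts below are hypothetical and serve only to illustrate the arithmetic; they do not come from the Bihar data.

```r
# Hypothetical enrolment and staffing counts, for illustration only.
students <- c(School_A = 1200, School_B = 450)
teachers <- c(School_A = 20, School_B = 30)

# Student teacher ratio = number of students / number of teachers.
ratio <- students / teachers

print(ratio)
```

Under these assumed counts, School_A has 60 students per teacher and School_B has 15, so School_B would rank better on this parameter.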
To analyse this socio-economic parameter we plot a bar graph, the code for which is given below.
ggplot(bihar, aes(State...District,group = 1,y = Student.Teacher.Ratio,label= Student.Teacher.Ratio)) +
geom_bar(stat = "identity", aes(y = Student.Teacher.Ratio, colour = "Student.Teacher.Ratio")) + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + geom_hline(aes(yintercept=69.21, colour = 'Mean Student.Teacher.Ratio')) + ggtitle("Student Teacher Ratio") +
theme(plot.title = element_text(face="bold")) + theme(legend.position = "bottom", legend.direction = "vertical")+ geom_text(size = 3, colour = 'red', vjust=2)
As the plot shows, the student teacher ratio is poor across the entire state, and worst in Jamui district (367.09 against a state figure of 69.21). To rank the districts of Bihar by student teacher ratio, we run the following code, which sorts them in ascending order of the ratio. [Lower is better].
student <- bihar[ order(bihar[,25]), ]
student[,c(1,25)]
## State...District Student.Teacher.Ratio
## 18 Lakhisarai 34.25
## 15 Katihar 43.45
## 32 Saran 44.10
## 38 Vaishali 46.39
## 2 Araria 50.39
## 5 Begusarai 50.66
## 22 Muzaffarpur 51.85
## 7 Bhojpur 53.50
## 8 Buxar 56.11
## 34 Sheohar 56.11
## 37 Supaul 56.37
## 23 Nalanda 56.85
## 26 Patna 59.34
## 13 Jehanabad 60.33
## 33 Sheikhpura 63.62
## 14 Kaimur (Bhabua) 66.76
## 21 Munger 67.16
## 35 Sitamarhi 68.15
## 1 Bihar 69.21
## 28 Purnia 69.25
## 10 Gaya 72.27
## 24 Nawada 78.22
## 27 Purba Champaran 80.63
## 20 Madhubani 82.86
## 29 Rohtas 84.11
## 4 Banka 86.66
## 16 Khagaria 88.76
## 31 Samastipur 88.86
## 36 Siwan 92.21
## 25 Pashchim Champaran 93.60
## 6 Bhagalpur 93.80
## 9 Darbhanga 109.26
## 19 Madhepura 109.35
## 30 Saharsa 109.35
## 11 Gopalganj 116.75
## 3 Aurangabad 126.66
## 17 Kishanganj 135.53
## 12 Jamui 367.09
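The reference line in the plot uses the hard-coded value 69.21, which matches the "Bihar" row in the table above. A minimal sketch of reading that value from the data instead, and of comparing each district against it, is given below. The data frame `toy` contains four rows copied from the output above, and the names `toy`, `state_value`, `Relative.To.State` and `above_state` are assumptions made for illustration.

```r
# Toy subset: four rows copied from the output above (the real data frame is `bihar`).
toy <- data.frame(
  State...District = c("Bihar", "Lakhisarai", "Patna", "Jamui"),
  Student.Teacher.Ratio = c(69.21, 34.25, 59.34, 367.09),
  stringsAsFactors = FALSE
)

# Take the state-level figure from the data rather than typing it in by hand.
is_state <- toy$State...District == "Bihar"
state_value <- toy$Student.Teacher.Ratio[is_state]

# Keep districts only and express each ratio as a multiple of the state figure.
districts <- toy[!is_state, ]
districts$Relative.To.State <- round(districts$Student.Teacher.Ratio / state_value, 2)

# Districts whose ratio is worse (higher) than the state figure.
above_state <- districts$State...District[districts$Student.Teacher.Ratio > state_value]

print(districts[order(districts$Student.Teacher.Ratio), ], row.names = FALSE)
print(above_state)
```

With the full data, `state_value` could be passed to `geom_hline(yintercept = state_value)` in place of the literal 69.21, so the reference line stays in step with the spreadsheet if the figures are updated.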
The given problem statement has been solved, and a comprehensive report on 10 socio-economic parameters is presented above.
*The image and Excel file(s) used are also attached. This file must be placed in the same folder as “bihar_sample.xlsx” for the code to work.